Pipeline tool for data science in R
Data analysis workflow with dependencies and multiple outcomes.
A change in one component requires updating part or all of the workflow.
The targets package: a pipeline tool for statistics and data science in R
Reproducible workflows avoiding repetition
Skips costly run time for tasks that are already up to date
Code has long runtimes (slow or complex)
Interconnected tasks with dependencies
Different outputs (e.g. presentation and report)
Code
_targets.R file
Data
make.R file
Manuscript
The targets file configures and defines the pipeline.
load packages (targets and any others the pipeline needs)
source functions
The use_targets() function can set up the targets file.
list of target objects (data, result, figure)
create target objects with tar_target()
Each target is a step of the analysis, and its value is stored in the _targets/objects/ folder.
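The structure described above can be sketched as a minimal _targets.R file. The helper functions (simulate_data, fit_model, extract_slope), the toy data, and the target names are hypothetical and serve only as illustration. In a real project the helpers would typically live in their own script and be loaded with source().

```r
# _targets.R (sketch; helper functions, data, and target names are hypothetical)
library(targets)

# Packages the targets need when they run ("stats" is only a placeholder here).
tar_option_set(packages = c("stats"))

# In a real project these helpers would usually be kept in a separate
# script and loaded here with source() (for example a file under R/).
simulate_data <- function(n) {
  data.frame(x = seq_len(n), y = 2 * seq_len(n) + 1)
}

fit_model <- function(data) {
  lm(y ~ x, data = data)
}

extract_slope <- function(model) {
  coef(model)[["x"]]
}

# The file must end with a list of target objects.
# Each tar_target() names one step and gives the command that builds it;
# using an upstream target name inside a command creates the dependency.
list(
  tar_target(raw_data, simulate_data(20)),
  tar_target(model, fit_model(raw_data)),
  tar_target(slope, extract_slope(model))
)
```

Keeping the commands as short function calls keeps the target list readable and lets each function be tested on its own.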
Separate make.R file:
Use tar_manifest(fields = all_of("command")) to check for errors.
Use tar_visnetwork() to visualise the dependency graph.
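A make.R script along these lines might look as follows. This is a sketch that assumes a _targets.R file already exists in the working directory. The final tar_make() call is an addition not named on the slide; it is the targets function that runs the pipeline.

```r
# make.R (sketch; assumes _targets.R is present in the working directory)
library(targets)

# List every target together with the command that builds it.
# Errors in the pipeline definition surface here, before anything is run.
print(tar_manifest(fields = all_of("command")))

# Draw the dependency graph (needs the optional visNetwork package).
if (requireNamespace("visNetwork", quietly = TRUE)) {
  print(tar_visnetwork())
}

# Build every target that is missing or out of date; the rest are skipped.
tar_make()
```

Checking the manifest and the graph before calling tar_make() is cheap, because neither step runs any of the target commands.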
Start small and build on it.
Add small steps one at a time: check that each works, then add the next.
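The step-by-step approach can be tried out in a throwaway directory. The sketch below uses a hypothetical toy pipeline: it builds one target, adds a second, and uses tar_outdated() to confirm that only the new step still needs to run.

```r
library(targets)

# Work in a temporary directory so no real project files are touched.
project_dir <- tempfile("targets_demo_")
dir.create(project_dir)
old_wd <- setwd(project_dir)

# Step 1: write a _targets.R with a single target and build it.
tar_script({
  library(targets)
  list(
    tar_target(raw_data, data.frame(x = 1:10, y = 2 * (1:10) + 1))
  )
}, ask = FALSE)
tar_make()

# Step 2: rewrite _targets.R with one additional target that depends on the first.
tar_script({
  library(targets)
  list(
    tar_target(raw_data, data.frame(x = 1:10, y = 2 * (1:10) + 1)),
    tar_target(slope, coef(lm(y ~ x, data = raw_data))[["x"]])
  )
}, ask = FALSE)

# Only the new target should be reported as out of date.
outdated_before <- tar_outdated()

# raw_data is unchanged, so this run skips it and builds only slope.
tar_make()
outdated_after <- tar_outdated()

# Read the stored value of the new target back from _targets/objects/.
slope <- tar_read(slope)

# Return to the original working directory.
setwd(old_wd)
```

Repeating the cycle of adding a target, checking tar_outdated(), and running tar_make() keeps each run short and makes it clear which step introduced a problem.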
Download this template repository.
get it running
make changes to it
GitHub page
Targets user manual
Targets workflow examples
Open, reproducible, and transparent science course